(Attorney Docket No. PHNL02n56), filed November 20, 2002

[...] parameters are quantized [...] and S7, the determined [...]

[...] signal [...]

a decoding module 210 which [...] comprises [...]
One of the challenges is to generate the monaural signal S, step S8, in such a
way that, on decoding into the output channels, the perceived sound timbre is exactly the
same as for the input channels.

Several methods of generating this sum signal have been suggested previously.
In general these compose a mono signal as a linear combination of the input signals.
Particular techniques include:

1. Simple summation of the input signals. See for example 'Efficient representation of
spatial audio using perceptual parametrization', by C. Faller and F. Baumgarte,
WASPAA'01, Workshop on applications of signal processing on audio and acoustics,
New Paltz, New York, 2001.

2. Weighted summation of the input signals using principal component analysis (PCA). See
for example European Patent Application No. 02076408.0 filed April 10, 2002 (Attorney
Docket No. PHNL020284) and European Patent Application No. 02076410.6 filed April
10, 2002 (Attorney Docket No. PHNL020283). In this scheme, the squared weights of the
summation sum up to one and the actual values depend on the relative energies in the
input signals.

3. Weighted summation with weights depending on the time-domain correlation of
the input signals. See for example 'Joint stereo coding of audio signals', by D. Sinha,
European patent application EP 1 107 232 A2. In this method, the weights sum to +1,
while the actual values depend on the cross-correlation of the input channels.

4. US 5,701,346, Herre et al., discloses weighted summation with energy-preservation
scaling for downmixing left, right, and center channels of wideband signals. However,
this is not performed as a function of frequency.

These methods can be applied to the full-bandwidth signal or can be applied
on band-filtered signals which all have their own weights for each frequency band. However,
all methods described have one drawback. If the cross-correlation is frequency-dependent,
which is very often the case for stereo recordings, coloration (i.e., a change of the perceived
timbre) of the sound of the decoder occurs.

This can be explained as follows: for a frequency band that has a
cross-correlation of +1, linear summation of two input signals results in a linear addition of the



[...] the additive signal to determine the [...]

[...] that, in multi-channel signals, [...] frequency components.

[...] such summation would [...] frequency components tend to be more correlated than [...]

[...] frequency-dependent correlation of channels [...] energy levels of more highly
correlated and [...] weighted) followed by applying a correction [...]

[...] sum up to +1 but sum to a value that depends on the cross-correlation [...]

It should be noted that the invention can be applied to any system
where two or more input channels are combined.
Figure 1 shows a prior art encoder;

Figure 2 shows a block diagram of an audio system including the encoder of
Figure 1;

Figure 3 shows the steps performed by a signal summation component of an
audio coder according to a first embodiment of the invention; and

Figure 4 shows linear interpolation of the correction factors m(i) applied by
the summation component of Figure 3.

According to the present invention, there is provided an improved signal
summation component (S8'), in particular for performing the step corresponding to S8 of
Figure 1. Nonetheless, it will be seen that the invention is applicable anywhere two or more
signals need to be summed. In a first embodiment of the invention, the summation
component adds left and right stereo channel signals prior to the summed signal S being
encoded, step S9.

Referring now to Figure 3, in the first embodiment, the left (L) and right (R)
channel signals provided to the summation component comprise multi-channel segments m1,
m2, ... overlapping in successive time frames t(n-1), t(n), t(n+1). Typically, sinusoids are
updated at a rate of 10 ms and each segment m1, m2, ... is twice the length of the update
interval, i.e. 20 ms.

For each overlapping time window t(n-1), t(n), t(n+1) for which the L, R
channel signals are to be summed, the summation component uses a (square-root) Hanning
window function to combine each channel signal from overlapping segments m1, m2, ... into a
respective time-domain signal representing each channel for a time window, step 42.

An FFT (Fast Fourier Transform) is applied on each time-domain windowed
signal, resulting in a respective complex frequency spectrum representation of the windowed
signal for each channel, step 44. For a sampling rate of 44.1 kHz and a frame length of 20 ms,
the length of the FFT is typically 882. This process results in a set of K frequency
components for both input channels (L(k), R(k)).
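Steps 42 and 44 can be illustrated with a minimal Python sketch. It is not the coder's implementation: the function names are hypothetical, the periodic form of the square-root Hann window is an assumption (the text only says "(square-root) Hanning"), and a naive DFT stands in for the FFT so that the example needs nothing beyond the standard library.

```python
import cmath
import math

SAMPLE_RATE_HZ = 44100   # sampling rate quoted in the text
FRAME_SECONDS = 0.020    # 20 ms frame length quoted in the text


def frame_length(sample_rate_hz, frame_seconds):
    """Number of samples in one analysis frame."""
    return round(sample_rate_hz * frame_seconds)


def sqrt_hann(n):
    """Periodic square-root Hann window of length n (assumed window form)."""
    return [math.sqrt(0.5 - 0.5 * math.cos(2.0 * math.pi * t / n)) for t in range(n)]


def dft(x):
    """Naive O(n^2) discrete Fourier transform; an FFT returns the same values."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def analyse(frame):
    """Step 42 then step 44: window one frame, return its complex spectrum."""
    window = sqrt_hann(len(frame))
    return dft([w * s for w, s in zip(window, frame)])
```

With the figures quoted above, `frame_length(SAMPLE_RATE_HZ, FRAME_SECONDS)` gives the 882-sample transform length, and `analyse` would be called once per channel to obtain L(k) and R(k).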

In the first embodiment, the two input channel representations L(k) and R(k)
are first combined by a simple linear summation, step 46. It will be seen, however, that this
could easily be extended to weighted summation. Thus, for the present embodiment, sum
signal S(k) comprises:

S(k) = L(k) + R(k)

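Separately, the frequency components of the input signals L(k) and R(k) are grouped into
several frequency bands, preferably using perceptually-related bandwidths (ERB or BARK
scale) and, for each subband i, an energy-preserving correction factor m(i) is computed, step
45:

$$m^2(i) = \frac{\sum_{k \in i} \{ |L(k)|^2 + |R(k)|^2 \}}{2 \sum_{k \in i} |S(k)|^2} = \frac{\sum_{k \in i} \{ |L(k)|^2 + |R(k)|^2 \}}{2 \sum_{k \in i} |L(k) + R(k)|^2} \qquad \text{(Equation 1)}$$

which can also be written as:

$$m^2(i) = \frac{\sum_{k \in i} |L(k)|^2 + \sum_{k \in i} |R(k)|^2}{2 \left( \sum_{k \in i} |L(k)|^2 + \sum_{k \in i} |R(k)|^2 + 2 \rho_{LR}(i) \sqrt{\sum_{k \in i} |L(k)|^2 \sum_{k \in i} |R(k)|^2} \right)} \qquad \text{(Equation 2)}$$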

with ρ_LR(i) being the (normalized) cross-correlation of the waveforms of subband i, a
parameter used elsewhere in parametric multi-channel coders and so readily available for the
calculations of Equation 2. In any case, step 45 provides a correction factor m(i) for each
subband i.
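The two forms of the correction factor can be compared numerically. The sketch below is illustrative only: the function names are hypothetical, and the normalized cross-correlation is assumed to be the real part of the sum of L(k) times the conjugate of R(k), divided by the square root of the product of the band energies, which is the definition under which Equations 1 and 2 agree.

```python
import math


def band_energy(x):
    """Sum of squared magnitudes of the bins of one subband."""
    return sum(abs(v) ** 2 for v in x)


def correction_factor_eq1(left, right):
    """m(i) from Equation 1, using the energy of the sum signal S(k) = L(k) + R(k)."""
    summed = [l + r for l, r in zip(left, right)]
    return math.sqrt(
        (band_energy(left) + band_energy(right)) / (2.0 * band_energy(summed))
    )


def correction_factor_eq2(left, right):
    """m(i) from Equation 2, using the normalized cross-correlation rho_LR(i)."""
    e_l, e_r = band_energy(left), band_energy(right)
    # Assumed definition of rho_LR(i): real part of the normalized inner product.
    cross = sum((l * r.conjugate()).real for l, r in zip(left, right))
    rho = cross / math.sqrt(e_l * e_r)
    denominator = 2.0 * (e_l + e_r + 2.0 * rho * math.sqrt(e_l * e_r))
    return math.sqrt((e_l + e_r) / denominator)
```

For identical inputs both functions return 0.5. For two inputs that occupy different bins (zero correlation) both return the square root of 0.5, so the corrected sum carries the mean of the two input energies in either case.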

The next step 47 then comprises multiplying each frequency component
S(k) of the sum signal with a correction filter C(k):

S'(k) = S(k) C(k) = C(k) L(k) + C(k) R(k)    (Equation 3)



It will be seen from the last component of Equation 3 that the correction filter
can be applied to either the summed signal S(k) alone or each input channel (L(k), R(k)). As
such, steps 46 and 47 can be combined when the correction factor m(i) is known, or
performed separately with the summed signal S(k) being used in the determination of m(i), as
indicated by the hashed line in Figure 3.

In the preferred embodiments, the correction factors m(i) are used for the
center frequencies of each subband, while for other frequencies, the correction factors m(i)
are interpolated to provide the correction filter C(k) for each frequency component (k) of a
subband i. In principle, any interpolation function can be used; however, empirical results
have shown that a simple linear interpolation scheme suffices, Figure 4.
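A minimal sketch of the linear interpolation follows. The names are hypothetical, band centres are assumed to be given as strictly increasing bin indices, and holding C(k) constant below the first centre and above the last one is an assumption, since the text does not describe the behaviour at the spectrum edges.

```python
def correction_filter(centers, factors, num_bins):
    """Interpolate per-band factors m(i), given at band-centre bins, to C(k)."""
    c = []
    for k in range(num_bins):
        if k <= centers[0]:
            c.append(factors[0])        # assumed: hold the first value
        elif k >= centers[-1]:
            c.append(factors[-1])       # assumed: hold the last value
        else:
            # Index of the last band centre at or below bin k.
            j = max(i for i, ck in enumerate(centers) if ck <= k)
            k0, k1 = centers[j], centers[j + 1]
            t = (k - k0) / (k1 - k0)
            c.append((1.0 - t) * factors[j] + t * factors[j + 1])
    return c


def apply_correction(summed, c):
    """Step 47: S'(k) = S(k) C(k) for every bin."""
    return [s * ck for s, ck in zip(summed, c)]
```

For example, `correction_filter([2, 6], [0.5, 1.0], 9)` rises linearly from 0.5 at bin 2 to 1.0 at bin 6 and is flat outside that range.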
Alternatively, an individual correction factor could be derived for each FFT
bin (i.e., subband i corresponds to frequency component k), in which case no interpolation is
necessary. This method, however, may result in a jagged rather than a smooth frequency
behaviour of the correction factors, which is often undesired due to resulting time-domain
distortions.

In the preferred embodiments, the summation component then takes an inverse
FFT of the corrected summed signal S'(k) to obtain a time domain signal, step 48. By
applying overlap-add for successive corrected summed time domain signals, step 50, the final
summed signal s1, s2, ... is created and this is fed through to be encoded, step S9, Figure 1. It
will be seen that the summed segments s1, s2, ... correspond to the segments m1, m2, ... in the
time domain and as such no loss of synchronisation occurs as a result of the summation.
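The overlap-add of step 50 can be sketched as below. This is an illustration under stated assumptions rather than the described coder: the names are hypothetical, frames overlap by 50%, and the square-root Hann window is assumed to be applied a second time before overlap-add (the text names the window only at step 42). Under those assumptions the two windows multiply to a Hann window whose shifted copies sum to one, so an unmodified signal is reconstructed exactly away from the edges.

```python
import math


def sqrt_hann(n):
    """Periodic square-root Hann window of length n (assumed window form)."""
    return [math.sqrt(0.5 - 0.5 * math.cos(2.0 * math.pi * t / n)) for t in range(n)]


def split_frames(signal, n, hop):
    """Cut the signal into frames of n samples that start every hop samples."""
    return [signal[s:s + n] for s in range(0, len(signal) - n + 1, hop)]


def overlap_add(frames, hop):
    """Step 50: add successive frames at their original time offsets."""
    n = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + n)
    for idx, frame in enumerate(frames):
        start = idx * hop
        for t, value in enumerate(frame):
            out[start + t] += value
    return out


def round_trip(signal, n):
    """Window, (spectral processing would go here), window again, overlap-add."""
    hop = n // 2
    w = sqrt_hann(n)
    frames = split_frames(signal, n, hop)
    processed = [[w[t] * w[t] * frame[t] for t in range(n)] for frame in frames]
    return overlap_add(processed, hop)
```

Because each output frame is written back at the offset of the input segment it came from, the summed segments stay aligned with the input segments in time.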

It will be seen that where the input channel signals are not overlapping signals
but rather continuous time signals, then the windowing step 42 will not be required.
Similarly, if the encoding step S9 expects a continuous time signal rather than an overlapping
signal, the overlap-add step 50 will not be required. Furthermore, it will be seen that the
described method of segmentation and frequency-domain transformation can also be replaced
by other (possibly continuous-time) filterbank-like structures. Here, the input audio signals
are fed to a respective set of filters, which collectively provide an instantaneous frequency
spectrum representation for each input audio signal. This means that sequential segments can
in fact correspond with single time samples rather than blocks of samples as in the described
embodiments.

It will be seen from Equation 1 that there are circumstances where particular
frequency components for the left and right channels may cancel out one another or, if they
have a negative correlation, they may tend to produce very large correction factor values
m²(i) for a particular band. In such cases, a sign bit could be transmitted to indicate that the
sum signal for the component S(k) is:

S(k) = L(k) - R(k)

with a corresponding subtraction used in Equations 1 or 2.

Alternatively, the components for a frequency band i might be rotated more
into phase with one another by an angle α(i). The ITD analysis process S3 provides the
(average) phase difference between (subbands of the) input signals L(k) and R(k). Assuming
that for a certain frequency band i the phase difference between the input signals is given by
α(i), the input signals L(k) and R(k) can be transformed to two new input signals L'(k) and
R'(k) prior to summation according to the following:

[...]

with c being a parameter which determines the distribution of phase alignment between the
two input channels (0 < c < 1).
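The phase alignment can be illustrated with a short sketch. The transformation below is one form consistent with the description, not necessarily the one used in the specification: it assumes α(i) is the phase of L relative to R in the band, rotates L by a fraction c of that angle and R by the remaining fraction in the opposite direction, and uses hypothetical function names.

```python
import cmath
import math


def phase_difference(left, right):
    """Assumed alpha(i): average phase of L relative to R over the bins of a band."""
    return cmath.phase(sum(l * r.conjugate() for l, r in zip(left, right)))


def align_phases(left, right, alpha, c):
    """Rotate L by -c*alpha and R by +(1-c)*alpha so the band ends up in phase.

    c = 0 leaves L untouched and rotates only R; c = 1 rotates only L.
    """
    rot_l = cmath.exp(-1j * c * alpha)
    rot_r = cmath.exp(1j * (1.0 - c) * alpha)
    return [l * rot_l for l in left], [r * rot_r for r in right]
```

The rotations change only phases, so the band energies in the numerator of Equation 1 are unchanged, while the energy of the sum in the denominator no longer collapses when the inputs are out of phase.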

In any case, it will be seen that where for example two channels have a
correlation of +1 for a sub-band i, then m²(i) will be 1/4 and so m(i) will be 1/2. Thus the
correction factor C(k) for any component in the band i will tend to preserve the original
energy level by tending to take half of each original input signal for the summed signal.
However, as can be seen from Equation 1, where a frequency band i of a stereo signal
includes spatial properties, the energy of the signal S(k) will tend to get smaller than if they
were in phase, while the sum of the energies of the L, R signals will tend to stay large and so
the correction factor will tend to be larger for those signals. As such, overall energy levels in
the sum signal will still be preserved across the spectrum, in spite of frequency-dependent
correlation in the input signals.

In a second embodiment, the extension towards multiple (more than two) input
channels is shown, combined with possible weighting of the input channels mentioned above.
The frequency-domain input channels are denoted by X_n(k), for the k-th frequency
component of the n-th input channel. The frequency components k of these input channels
are grouped in frequency bands i. Subsequently, a correction factor m(i) is computed for
subband i as follows:



In this equation, w_n(k) denote frequency-dependent weighting factors of the
input channels n (which can simply be set to +1 for linear summation). From these correction
factors m(i), a correction filter C(k) is generated by interpolation of the correction factors
m(i) as described in the first embodiment. Then the mono output channel S(k) is obtained
according to:



It will be seen that using the above equations, the weights of the different
channels do not necessarily sum to +1; however, the correction filter automatically corrects
for weights that do not sum to +1 and ensures (interpolated) energy preservation in each
frequency band.
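The multi-channel case can be sketched as follows. The formulas in the code are an assumed generalization of Equation 1, in which the factor 2 becomes the number of channels and each channel is weighted before its energy is taken; the function names are hypothetical.

```python
import math


def weighted_sum(channels, weights):
    """Assumed sum signal: S(k) = sum over n of w_n(k) * X_n(k)."""
    num_bins = len(channels[0])
    return [
        sum(w[k] * x[k] for x, w in zip(channels, weights))
        for k in range(num_bins)
    ]


def multichannel_correction_factor(channels, weights, band):
    """Assumed m(i): weighted channel energies over N times the sum-signal energy.

    `band` is the list of bin indices k that belong to subband i.
    """
    n_channels = len(channels)
    summed = weighted_sum(channels, weights)
    numerator = sum(
        abs(w[k] * x[k]) ** 2 for x, w in zip(channels, weights) for k in band
    )
    denominator = n_channels * sum(abs(summed[k]) ** 2 for k in band)
    return math.sqrt(numerator / denominator)
```

With two channels and unit weights this reduces to Equation 1. With three identical channels and unit weights it returns 1/3, so the corrected sum again has the energy of a single input channel whatever the weights sum to.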
least two input audio channels (L, R), comprising the steps of:

[...] summing [...] corresponding frequency components from respective frequency
spectrum representations for each audio channel (L(k), R(k)) to provide a set of summed
frequency components (S(k)) for each sequential segment;

for each of said plurality of sequential segments, calculating [...]

2. A method according to claim 1 further comprising the steps of:
providing (42) a respective set of sampled [...]

[...] sampled signal values comprises: [...]
5. A method according to claim 4 further comprising the step of:
applying overlap-add (50) to successive converted summed signal
representations to provide a final summed signal (s1, s2).

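6. A method according to claim 1 wherein two input audio channels are summed
and wherein said correction factors (m(i)) are determined according to the function:

$$m^2(i) = \frac{\sum_{k \in i} \{ |L(k)|^2 + |R(k)|^2 \}}{2 \sum_{k \in i} |S(k)|^2} = \frac{\sum_{k \in i} \{ |L(k)|^2 + |R(k)|^2 \}}{2 \sum_{k \in i} |L(k) + R(k)|^2}$$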



7. A method according to claim 1 wherein two or more input audio channels (X_n)
are summed according to the function:

[...]

wherein C(k) is the correction factor for each frequency component and wherein said
correction factors (m(i)) for each frequency band are determined according to the function:

m²(i) = [...]

wherein w_n(k) comprises a frequency-dependent weighting factor for each input channel.

8. A method according to claim 7 wherein w_n(k) = 1 for all input audio channels.

9. A method according to claim 7 wherein w_n(k) ≠ 1 for at least some input audio
channels.

10. A method according to claim 7 wherein the correction factor for each
frequency component (C(k)) is derived from a linear interpolation of the correction factors
(m(i)) for at least one band.



11. A method according to claim 1 further comprising the steps of:
for each of said plurality of frequency bands, determining an indicator (α(i)) of
the phase difference between frequency components of said audio channels in a sequential
segment; and

prior to summing corresponding frequency components, transforming the
frequency components of at least one of said audio channels as a function of said indicator
for the frequency band of said frequency components.

12. A method according to claim 11 wherein said transforming step comprises
operating the following functions on frequency components (L(k), R(k)) of left and right
input audio channels (L, R):

[...]

wherein 0 < c < 1 determines the distribution of phase alignment between the said input
channels.



13. A method according to claim 1 wherein said correction factor is a function of a
sum of energy of the frequency components of the summed signal in said band and a sum of
the energy of said frequency components of the input audio channels in said band.

14. A component (S8') for generating a monaural signal from a combination of at
least two input audio channels (L, R), comprising:

a summer (46) arranged to sum, for each of a plurality of sequential segments
(t(n)) of said audio channels (L, R), corresponding frequency components from respective
frequency spectrum representations for each audio channel (L(k), R(k)) to provide a set of
summed frequency components (S(k)) for each sequential segment;

means for calculating (45) a correction factor (m(i)) for each of a plurality of
frequency bands (i) of each of said plurality of sequential segments as a function of the energy
of the frequency components of the summed signal in said band ($\sum_{k \in i} |S(k)|^2$) and the
energy of said frequency components of the input audio channels in said band
($\sum_{k \in i} \{ |L(k)|^2 + |R(k)|^2 \}$); and

a correction filter (47) for correcting each summed frequency component as a
function of the correction factor (m(i)) for the frequency band of said component.
15. An audio coder including the component of claim 14.

16. Audio system comprising an audio coder as claimed in claim 15 and a
compatible audio player.